This document describes fitting a hierarchical ordination to data, including code and the details that are needed.
The data is included in the mvabund R-package, but was originally collected by Gibb et al.. It is includes abundance observations of 41 ant species at 30 sites in Australia, along with 7 environmental variables and 5 traits. The aim to is determine how the environmental variables and traits affect composition in the ecological community, including whether they interact. The methodological question is how does hierarchical ordination help.
The abundances, traits and environment are each stored in a different matrix. First we load the data and set some constants:
library(nimble)
library(nimbleHMC)
library(mvabund)
library(coda)
data("antTraits")
Y <- antTraits$abund # abundance data of sites by species
X <- scale(antTraits$env) # environmental variables of sites by predictors
qrX <- qr(X)
X <- X[,qrX$pivot[1:qrX$rank]]
TR <- scale(model.matrix(~0+.,antTraits$traits)) # species traits
qrTR <- qr(TR)
TR <- TR[,qrTR$pivot[1:qrTR$rank]]
NSites <- nrow(Y) # number of sites
NSpecies <- ncol(Y) # number of species
NTraits <- ncol(TR) # number of traits
NEnv <- ncol(X) # number of environmental predictors
# create data lists for Nimble
dat <- list(Y = Y, X = X, TR = TR)
consts <- list(NSites = NSites, NEnv = NEnv, NTraits = NTraits, NSpecies = NSpecies)
rm(NEnv, NTraits, NSpecies, NSites)
The data are counts of each species so we assume they follow a Poisson distribution with a log link function, as we would do in a standard generalised linear model. We assume that each species has a different mean abundance (i.e. for each species \(j\) we have a different intercept \(\beta_{0j}\)), and model the rest of the variation with a hierarchical ordination. This gives the following mean model on the link scale (with linear predictor \(\eta_{ij}\)):
\[ \eta_{ij} = \beta_{0j} + \boldsymbol{z}_i^\top \boldsymbol{\Sigma} \boldsymbol{\gamma}_j. \]
As with any ordination, \(\boldsymbol{z}\) and \(\boldsymbol{\gamma}_j\) are the site scores and species loadings, and the columns of \(\boldsymbol{Z}\) are the latent variables (which holds the site scores as the rows). Here we assume that they each have a variance of one, so that \(\boldsymbol{\Sigma}\) holds the variation of the latent variables: it will typically be a diagonal matrix, and if any of the terms on the diagonal are close to zero, this suggests that latent variable has (almost) no effect. It thus provides a straightforward summary of the relative importance of that latent variable, and is similar to the square root of eigenvalues (singular values) in an eigenanalysis.
If we stopped here, this would be a standard Generalized Linear Latent Variable Model. But, here we want to model both \(\boldsymbol{z}_i\) and \(\boldsymbol{\gamma}_j\), i.e. go from modelling model groups of species responding in similar ways to sites to modelling how the traits of the species affect their responses to the environment.
As a simplification, we can think about this as simply a regression of \(\boldsymbol{z}_i\) (the site effects) against the environmental variables and \(\boldsymbol{\gamma}_j\) against the traits. In reality, it is more complicated, because \(\boldsymbol{z}_i\) and \(\boldsymbol{\gamma}_j\) are estimated in the model, so their uncertainty needs to be propagated.
We denote the abundance at site \(i = 1 \dots n\) for species \(j = 1 \dots p\) as \(Y_{ij}\), or as a matrix as \(\boldsymbol{Y}\). The environmental variables are \(x_{ik}\) for the \(k = 1 \ldots K\) predictors (or \(\boldsymbol{X}\) as a matrix), and \(t = 1\ldots T\) traits as \(\boldsymbol{TR}\). We then assume
\[Y_{ij} \sim \text{Pois}(\lambda_{ij}),\]
with
\[\text{log}(\lambda_{ij}) = \eta_{ij}.\]
Consequently, \(\eta_{ij}\) is the linear predictor, which we further model with the hierarchical ordination:
\[ \eta_{ij} = \beta_{0j} + \boldsymbol{z}_i^\top \boldsymbol{\Sigma} \boldsymbol{\gamma}_j. \]
We then model \(\boldsymbol{z}_i\) (as in van der Veen et al. (2023)) and \(\boldsymbol{\gamma}_j\) hierarchically:
\[ \boldsymbol{z}_i = \boldsymbol{B}^\top\boldsymbol{x}_i + \boldsymbol{\epsilon}_i \] and
\[ \boldsymbol{\gamma}_j = \boldsymbol{\omega}^\top\boldsymbol{TR}_{j} + \boldsymbol{\varepsilon}_j \] where:
We fit the model with the Nimble R-package. We start with a single dimension for simplicity, so that we can show the steps needed.
To fit the model we need to 1) code up the likelihood, 2) specify the Markov Chain Monte carlo samplers and initial values, 3) run it. We parallelise running the MCMC so that each chain is run on a different processor. This means we need a function to run one chain, and then another to run all of the chains together.
OneLV.nim <- nimbleCode({
for (i in 1:NSites) {
for (j in 1:NSpecies) {
# These three lines specify the model:
Y[i, j] ~ dpois(lambda[i, j]) # Likelihood: Y follows a Poisson distribution
log(lambda[i, j]) <- eta[i, j] # Specify the log link function
eta[i,j] <- beta0[j] + gamma[j]*sd.LV*z[i] # linear predictor: species + LV
}
# Site scores: regression against environmental effects
# the Bs are the coefficients of the environmental effects
epsilonSTAR[i] ~ dnorm(0, sd = sd.Site) # Residual site effect
XB[i] <- inprod(X[i, 1:NEnv],BSTAR[1:NEnv])
zSTAR[i] <- XB[i] + epsilonSTAR[i]
}
for(j in 1:NSpecies) {
# Species effects: regression against trait effects
# The Os are the trait effects.
varepsilonSTAR[j] ~ dnorm(0, sd = sd.Species) # Residual
omegaTR[j] <- inprod(TR[j, 1:NTraits],OSTAR[1:NTraits])
gammaSTAR[j] <- omegaTR[j] + varepsilonSTAR[j]
beta0[j] ~ dnorm(0, sd = 1) # Species means
}
# Here we standardise z and gamma, so the both have a variance of 1.
# The standard deviation of the latent variable is sd.LV.
StdDev.z <- sd(zSTAR[1:NSites])
StdDev.gamma <- sd(gammaSTAR[1:NSpecies])
z[1:NSites] <- zSTAR[1:NSites]/StdDev.z
gamma[1:NSpecies] <- gammaSTAR[1:NSpecies]/StdDev.gamma
# Scale other parameters
epsilon[1:NSites] <- epsilonSTAR[1:NSites]/StdDev.z
B[1:NEnv] <- BSTAR[1:NEnv]/StdDev.z
varepsilon[1:NSpecies] <- varepsilonSTAR[1:NSpecies]/StdDev.gamma
O[1:NTraits] <- OSTAR[1:NTraits]/StdDev.gamma
# priors for scales
sd.Site ~ dexp(1)
sd.Species ~ dexp(1)
sd.LV ~ dexp(1)
for(k in 1:NEnv){
BSTAR[k] ~ dnorm(0, sd = 1)
}
for(tr in 1:NTraits){
OSTAR[tr] ~ dnorm(0, sd = 1)
}
})
inits <- function(consts){
B = rnorm(consts$NEnv)
O = rnorm(consts$NTraits)
varepsilon = rnorm(consts$NSpecies)
epsilon = rnorm(consts$NSites)
list(
BSTAR = B,
OSTAR = O,
epsilonSTAR = epsilon,
varepsilonSTAR = varepsilon,
sd.Site = rexp(1),
sd.Species = rexp(1),
beta0 = rnorm(consts$NSpecies),
sd.LV = rexp(1)
)
}
# Function to run one chain: it can be done with HMC or other MCMC algorithms.
# "block" can be used to specify blocking structures for the slice sampler
# "slice" can be used to specify parameters on which to apply univariate Metropolis-Hastings
# parameters that are not in "block" or "slice" are HMCed
RunOneChain <- function(seed, dat, code, inits, consts, ToMonitor=NULL,
Nburn=5e3, NIter=5.5e4, Nthin=10, block = NULL, slice = NULL, ...) {
require(nimble)
require(nimbleHMC)
AllSamplers <- HMCsamplers <- c('epsilonSTAR', 'varepsilonSTAR', 'beta0', 'OSTAR', 'BSTAR',
'sd.Site', 'sd.Species', 'sd.LV')
if(!is.null(block)){
HMCsamplers <- HMCsamplers[!HMCsamplers%in%unique(gsub("\\s*\\[[^\\)]+\\]","",c(unlist(block),unlist(slice))))]
}
if(is.null(ToMonitor)) {
ToMonitor <- c("beta0", "sd.Species", "sd.Site", "sd.LV", "B", "O",
"epsilon", "varepsilon", "gamma", "z")
}
mod <- nimbleModel(code = code, name = "HO", constants = consts,
inits = inits(consts), data = dat, buildDerivs = TRUE)
model <- compileNimble(mod)
# Do HMC
conf <- configureHMC(model, nodes = HMCsamplers, monitors = ToMonitor, print = FALSE,
control=list(nwarmup=Nburn))
if(!is.null(block)) {
if(is.list(block)){
# Use a slice everything that not being HMCed
lapply(block, conf$addSampler, type = "AF_slice")
}else{
# Use a slice everything that not being HMCed
sapply(block, conf$addSampler, type = "AF_slice")
}
}
if(!is.null(slice)) {
if(is.list(slice)){
# Use a slice everything that not being HMCed
lapply(slice, conf$addSampler, type = "slice")
}else{
# Use a slice everything that not being HMCed
sapply(slice, conf$addSampler, type = "slice")
}
}
mcmc <- buildMCMC(conf)
cmcmc <- compileNimble(mcmc, project = model)
res <- runMCMC(cmcmc, niter=NIter, nburnin = Nburn, thin=Nthin,
nchains = 1, samplesAsCodaMCMC = TRUE, ...)
return(res)
}
# Function to run parallel chains
ParaNimble <- function(NChains, ...) {
opts <- list(...)
if(!is.null(opts$seeds) && (length(opts$seeds) == NChains)){
seeds <- opts$seeds
opts <- opts[-which(names(opts)=="seeds")]
}else{
seeds <- 1:NChains
}
require(parallel)
nimble_cluster <- makeCluster(NChains)
samples <- parLapply(cl = nimble_cluster, X = seeds, ...)
stopCluster(nimble_cluster)
# Name the chains in the list
chains <- setNames(samples,paste0("chain", 1:length(samples)))
chains
}
# Function to create list of names for parameters to use a block slice sampler for
MakeBlockList <- function(consts, LVwise = TRUE){
# builds list for slice AF sampling
# with LVwise = TRUE we are LV-wise blocking B and epsilon, and O and varepsilon
# otherwise, LVs are jointly blocked
if(LVwise & consts$nLVs>1){
blockList <- c(sapply(1:consts$nLVs,
function(x,consts){
c(paste0("BSTAR[", 1:consts$NEnv,", ",x, "]"),
paste0("epsilonSTAR[", 1:consts$NSites,", ",x, "]"))
}
,
consts = consts,simplify=F),
sapply(1:consts$nLVs,
function(x,consts){
c(paste0("OSTAR[", 1:consts$NTraits,", ",x, "]"),
paste0("varepsilonSTAR[", 1:consts$NSpecies,", ",x, "]"))
}
,
consts = consts,simplify=F)
)
if(consts$nLVs>1){
# remove entries of B,O-STAR that are fixed
for(q in 2:consts$nLVs){
blockList[[q]] <- blockList[[q]][-(1:(q-1))]
blockList[-c(1:consts$nLVs)][[q]] <- blockList[-c(1:consts$nLVs)][[q]][-(1:(q-1))]
}
}
}else{
blockList = list(c("BSTAR","epsilonSTAR"),c("OSTAR","varepsilonSTAR"))
}
blockList
}
Latent variable models are notorious for being unidentifiable, you can get the same mean abundances from different combinations of the parameters. We have to make some adjustments to the model to account for this: some of this is done in the model fitting, but for others it is easier to do it after we obtain the posterior samples.
The functions to swap the signs are included below. We want to make one parameter positive, so ideally we want to do this to a parameter that has both modes away from zero. We can identify this in an ad hoc way: for each chain for every parameter we calculate the proportion of iterations where the sign is positive, and then for every parameter we calculate the variance in that proportion. If a parameter is centered around 0 the mean proportion will be about 0.5, and will not vary much, whereas if it is some way from 0 the mean will be close to 0 or 1, and the variance will be high. ALternativley, we coiuld look at the largest \(|p - 0.5|\), or we can vissually identify such parameters from the posterior samples. There may be even better alternatives.
# Utility Function to get logical indicators for if names contains a string in v
GetInds <- function(v, names) {
if(length(v)==1) {
res <- grep(v, names)
} else {
res <- c(unlist(sapply(v, grep, x=names)))
}
res
}
# Function to swap signs of all variables varinds to have same sign as vartosign.
ReSignChain <- function(chain, varinds, vartosign) {
# MeanSign <- sign(mean(chain[ ,s])) # Might need this
res <- t(apply(chain, 1, function(x, vs, vi) {
Names <- names(x)[vi]
if(any(grepl(",", Names))) {
lvind <- gsub(".*, ", "", Names)
} else {
lvind <- seq_along(vs)
}
x[vi] <- x[vi]*sign(x[vs[lvind]])
x
}, vs=vartosign, vi=varinds))
as.mcmc(res)
}
# Function to post-process chains to swap signs.
postProcess <- function(Chains, VarsToProcess, VarsToSwapBy = NULL, VarToSign=NULL, print=FALSE, rule = 2) {
if(is.null(VarToSign)) VarToSign <- VarsToProcess
SignInd <- GetInds(VarToSign, colnames(Chains[[1]]))
# Get indicators for all variables to have their signs changed
ProcessInds <- GetInds(VarsToProcess, names = colnames(Chains[[1]]))
# Check if > 1 LV
SeveralLVs <- any(grepl(",", colnames(Chains[[1]])[SignInd]))
LV <- gsub(".*, ", "", colnames(Chains[[1]])[SignInd])
if(rule==1){
# Calculate variance of mean of indicator of sign:
# hopefully largest is variable with most sign swapping (i.e. )
Signs <- as.data.frame(lapply(Chains, function(mat, ind) {
colMeans(mat[,ind]>0)
}, ind=SignInd))
VarSign <- apply(Signs, 1, var)
if(SeveralLVs) {
LV <- gsub(".*, ", "", colnames(Chains[[1]])[SignInd])
#Chose variables who's sign will be used to swap other signs
if(is.null(VarsToSwapBy)){
VarsToSwapBy <- sapply(unique(LV), function(lv, vs) {
vv <- vs[grep(lv, names(vs))]
nm <- names(which(vv==max(vv)))
if(length(nm)>1) nm <- nm[1] # probably something more sophisticated is better
nm
}, vs = VarSign, simplify = TRUE)
}
} else { # only 1 LV
if(is.null(VarsToSwapBy)){
#Chose variables who's sign will be used to swap other signs
VarsToSwapBy <- names(which(VarSign==max(VarSign)))[1]
}
}
}else if(rule==2){
require(mousetrap)
jointChains <- do.call(rbind, Chains)
# bimodality score
bms<-apply(jointChains[,SignInd],2,bimodality_coefficient)
# find maximum bimodality score
if(SeveralLVs){
lstSgn <- sapply(unique(LV),function(lv)which.max(bms[grep(lv,names(bms))]),simplify=F)
# formatting
names(lstSgn) <- NULL
VarsToSwapBy <- names(unlist(lstSgn))
names(VarsToSwapBy) <- paste0(sort(unique(LV)))
}else{
lstSgn <- which.max(bms)
# formatting
VarsToSwapBy <- names(lstSgn)
names(VarsToSwapBy) <- "1]"
}
}
if(print) message(paste0("Swapping by ", paste(VarsToSwapBy, collapse=", ")))
chains.sgn <- lapply(Chains, ReSignChain, vartosign=VarsToSwapBy, varinds=ProcessInds)
as.mcmc.list(chains.sgn)
}
Finally, we can run the MCMC.
consts$nLVs = 1
chains.unsgn <- ParaNimble(4, fun = RunOneChain,
dat = dat,
code = OneLV.nim,
inits = inits,
# Nburn = 5e1, NIter = 5.5e2, Nthin = 1, # for a small run
Nburn = 1e3, NIter = 2e3, Nthin = 1, # for a big HMC run
consts = consts,
block = MakeBlockList(consts), slice = paste0("beta0[",1:consts$NSpecies,"]")) # HMC on the rest (hypers)
# post-process chains for sign-swapping
VarsToProcess <- c("^B", "^O", "^epsilon", "^varepsilon", "^z", "^gamma")
VarsToSign <- c("^B")
chains <- postProcess(chains.unsgn, VarsToProcess = VarsToProcess,
VarToSign = VarsToSign, print = TRUE, rule = 2)#, VarsToSwapBy = "B[2]")
summ.HO = summary(chains)
There are a lot of parameters, so a lot of results to look at. Here we concentrate on \(\boldsymbol{B}\), \(\boldsymbol{\omega}\), \(\boldsymbol{\sigma}\) and \(\boldsymbol{\delta}\), and a latent variable-plot.
First the environmental effects:
library(basicMCMCplots)
chainsPlot(chains, var = c("B"), legend = F,traceplot = T)
Then the trait effects:
chainsPlot(chains, var = c("O"), legend = F,traceplot = TRUE)
These look quite OK (after post-processing the signs). Now we can create a plot of the site scores against their indices to inspect the results.
PlotPost <- function(var, summ, varnames=NULL, ...) {
vars <- grep(var, rownames(summ$statistics))
if(is.null(varnames)) varnames <- rownames(summ$statistics)[grep(var, rownames(summ$statistics))]
if(length(varnames)!=length(vars))
stop(paste0("Number of variable names, ", length(varnames), "not the same as number of variables, ",
length(vars)))
plot(summ$statistics[vars,"Mean"], 1:length(vars),
xlim=range(summ$quantiles[vars,]), yaxt="n", ...)
segments(summ$quantiles[vars,"2.5%"], 1:length(vars), summ$quantiles[vars,"97.5%"], 1:length(vars))
segments(summ$quantiles[vars,"25%"], 1:length(vars), summ$quantiles[vars,"75%"], 1:length(vars), lwd=3)
abline(v=0, lty=3)
axis(2, at=1:length(vars), labels=varnames, las=1)
}
par(mar = c(4.1, 4.5, 1, 1))
PlotPost(var = "B", summ = summ.HO, varnames = gsub("\\.", "\n", colnames(X)),
xlab = "Environmental Effect", ylab = "")
We can see that both Coarse Woody Debris (CWD) and canopy cover have positive effects on the latent variable, so ants that have a positive species score will be more abundant in areas with more canopy cover and CWD.
We can also look at the trait effects. The bottom line is that there is too much uncertainty to say anything.
par(mar=c(4.1, 7, 1, 1))
PlotPost(var = "O", summ = summ.HO, varnames = gsub("\\.", "\n", colnames(TR)),
xlab = "Trait Effect", ylab = "")
We can also calculate the full distributions of site and species scores, and plot them. We see that, well, there is variation, so the trait effects are not uncertain because the species scores are uncertain: it is because their effects are small.
par(mfrow = c(2, 1), mar = c(2, 12, 4, 1))
PlotPost("^z", summ = summ.HO, varnames = NULL, ylab = "", xlab = "Latent variable 1", main = "Sites")
PlotPost("^gamma", summ = summ.HO, varnames = NULL, ylab = "", xlab = "Latent variable 1", main = "Species")
More than one dimension requires adding extra identifiability constraints. Now that we have two or more latent variables, we also have to worry about their rotation.
There are various ways to place the constraints, some more numerically stable than others. Generally, we note that the model on the link scale is similar to existing matrix decompositions, so that much about the required constraints can be learned from those.
To prevent rotational invariance, we require \(\boldsymbol{B}\) to have zeros above the main diagonal. Note that for a number of latent variables greater than the number of predictors \(K\), we need to add extra constraints to \(\boldsymbol{\omega}\). van der Veen et al (2023) assumed \(\boldsymbol{B}\) to be a semi-orthogonal matrix instead, as they encountered numerical instability with the constraints that we impose here. However, their constraint would require sampling on constrained parameter spaces, which is difficult, while this constraint formulation allows to use standard out-of-the-box MCMC samplers (also, it seems to work fine here). Note, that when $d p,t $ the model is rotationally invariant, and additional constraints will need to be added.# Model code
# Update our constants from before with the new number of LVs, rest remains the same
# Model code
HO.nim <- nimbleCode({
for (i in 1:NSites) {
for (j in 1:NSpecies) {
eta[i,j] <- beta0[j] + sum(gamma[j,1:nLVs]*sd.LV[1:nLVs]*z[i,1:nLVs])
log(lambda[i, j]) <- eta[i, j]
Y[i, j] ~ dpois(lambda[i, j])
}
for (l in 1:nLVs) {
XB[i, l] <- sum(X[i, 1:NEnv]*BSTAR[1:NEnv, l])
epsilonSTAR[i,l] ~ dnorm(0,sd.Site[l])#Residual
zSTAR[i,l] <- XB[i,l] + epsilonSTAR[i,l]
}
}
for(j in 1:NSpecies) {
for (l in 1:nLVs) {
omegaTR[j, l] <- sum(TR[j, 1:NTraits]*OSTAR[1:NTraits, l])
varepsilonSTAR[j,l] ~ dnorm(0,sd.Species[l]) # Residual
gammaSTAR[j,l] <- omegaTR[j,l] + varepsilonSTAR[j,l]
}
beta0[j] ~ dnorm(0, sd=1)
}
# Constraints to 0 on upper diagonal
# stole some code from Boral for this - thanks Francis
for(l in 1:(nLVs-1)) {
for(ll in (l+1):(nLVs)) {
BSTAR[l,ll] <- 0
OSTAR[l,ll] <- 0
}
}
for(l in 1:nLVs) {
# diagonal elements
BSTAR[l,l] ~ dnorm(0,sd=1)
OSTAR[l,l] ~ dnorm(0,sd=1)
## standardizing z and gamma
StdDev.z[l] <- sd(zSTAR[1:NSites,l])
z[1:NSites,l] <- zSTAR[1:NSites,l]/StdDev.z[l]
StdDev.gamma[l] <- sd(gammaSTAR[1:NSpecies,l])
gamma[1:NSpecies,l] <- gammaSTAR[1:NSpecies,l]/StdDev.gamma[l]
# Scale other parameters
epsilon[1:NSites,l] <- epsilonSTAR[1:NSites,l]/StdDev.z[l]
B[1:NEnv,l] <- BSTAR[1:NEnv,l]/StdDev.z[l]
varepsilon[1:NSpecies,l] <- varepsilonSTAR[1:NSpecies,l]/StdDev.gamma[l]
O[1:NTraits,l] <- OSTAR[1:NTraits,l]/StdDev.gamma[l]
# priors for scales
sd.Site[l] ~ dexp(1)
sd.Species[l] ~ dexp(1)
sd.LV[l] ~ dexp(1)
}
## Free lower diagonals
for(l in 2:nLVs) {
for(ll in 1:(l-1)) {
OSTAR[l,ll] ~ dnorm(0,sd=1)
BSTAR[l,ll] ~ dnorm(0,sd=1)
}
}
for(l in (nLVs+1):NEnv) {
for(ll in 1:(nLVs)) {
BSTAR[l,ll] ~dnorm(0,1)
}
} ## All other elements
for(l in (nLVs+1):NTraits) {
for(ll in 1:(nLVs)) {
OSTAR[l,ll] ~dnorm(0,1)
}
} ## All other elements
})
# inits <- function(consts, dat, seed){
# start.fit <- gllvm::gllvm(dat$Y, X=dat$X, num.lv.c = consts$nLVs, family = "poisson", seed = seed, sd.errors = FALSE)
# qrB <- qr(t(start.fit$params$LvXcoef)) # Rotate to B[1,2]=0
# B = t(qr.R(qr(t(start.fit$params$LvXcoef))))
# start.fit$params$theta <- start.fit$params$theta%*%qr.Q(qrB)
# mod.TR <- lm(start.fit$params$theta~0+dat$TR)
# O = coef(mod.TR)
# varepsilon = residuals(mod.TR)
# epsilon = start.fit$lvs%*%qr.Q(qrB)
# list(
# BSTAR = B,
# OSTAR = O,
# epsilonSTAR = epsilon,
# varepsilonSTAR = varepsilon,
# sd.Site = start.fit$params$sigma.lv,
# sd.Species = rexp(consts$nLVs),
# beta0 = rnorm(consts$NSpecies),
# sd.LV = rexp(consts$nLVs)
# )
# }
inits <- function(consts){
B = matrix(rnorm(consts$nLVs*consts$NEnv),ncol=consts$nLVs)
B[upper.tri(B)] = 0
O = matrix(rnorm(consts$nLVs*consts$NTraits),nrow=consts$NTraits)
varepsilon = mvtnorm::rmvnorm(consts$NSpecies,rep(0,consts$nLVs),diag(consts$nLVs))
epsilon = mvtnorm::rmvnorm(consts$NSites,rep(0,consts$nLVs),diag(consts$nLVs))
list(
BSTAR = B,
OSTAR = O,
epsilonSTAR = epsilon,
varepsilonSTAR = varepsilon,
sd.Site = rexp(consts$nLVs),
sd.Species = rexp(consts$nLVs),
beta0 = rnorm(consts$NSpecies),
sd.LV = rexp(consts$nLVs)
)
}
Our model is:
\[ \eta_{ij} = \beta_{0j} + \boldsymbol{z}_i^\top \boldsymbol{\Sigma} \boldsymbol{\gamma}_j. \]
But we can rotate by a rotation matrix \(M\) this to get an equivalent model:
\[ \eta_{ij} = \beta_{0j} + \boldsymbol{z}_j^\top \boldsymbol{M} \boldsymbol{\Sigma} \boldsymbol{M}^\top\boldsymbol{\gamma}_i. \] There are many possible rotations (see the https://cran.r-project.org/web/packages/GPArotation/index.html package for a long list). Here we will just rotate abased on a singular value decomposition (SVD) of the latent variables, as in the gllvm R-package Niku et al.. This rotates the ordination so that most of the variation in \(\boldsymbol{z}_i^\top \boldsymbol{\Sigma}\) is put on the first latent variable, and so that the second latent variable has the next largest amount of variance. Note, that this does not guarantee that these are the latent variables that explain most variation in the response, unlike in eigenanalysis.
In detail, for each draw from the posterior we:
The first step sweeps the scale of the LVs and loadings into the LV-specific scale so it still represents the total variance of the latent variables.
# Function to convert a variable in an MCMC chain that should be a matrix from
# a vector to the right matrix
ChainToMatrix <- function(ch, name) {
v <- ch[grep(paste0("^", name, "\\["), names(ch))]
l.row <- as.numeric(gsub(".*\\[", "", gsub(",.*", "", names(v))))
l.col <- as.numeric(gsub(".*, ", "", gsub("\\]", "", names(v))))
mat <- matrix(0, nrow=max(l.row), ncol=max(l.col))
for(i in seq_along(ch)) { # there must be a quicker way...
mat[l.row[i], l.col[i]] <- v[i]
}
mat
}
# Function to convert a variable in an MCMC chain that should be a vector to a diagonal matrix
ChainToDiag <- function(ch, name) {
v <- ch[grep(paste0("^", name, "\\["), names(ch))]
mat <- diag(v)
mat
}
# Function to rotate a latent variable, defaults to Varimax. Methods in GPArotation package
# ch: one draw from MCMC chain
# RotateBy: name of variable to rotate by (i.e. to calculate the rotation matrix for)
# SDby: name of Standard deviations
# ToRotate: Variables to rotate (e.g. epsilons, gamma/z).
# SDToRotate: Standatd devations to rotate
# rotation.fn: Function in GPArotation to calcualte rotation, or svd. Defaults to svd.
# scale: Should RotateBy be scaled by SDby before rotating? Defaults to FALSE
rotate2LV <- function(ch, RotateBy, SDby, ToRotate=NULL, SDToRotate=NULL, rotation.fn="svd",
scale=FALSE) {
require(GPArotation)
# Function to rotate matrices & return full vector or a matrix (retmat)
RotateVar <- function(name, vec, rotmat, retmat=FALSE) {
mat <- ChainToMatrix(ch=vec, name=name)
rotated <- mat%*%rotmat
if(retmat) {
res <- rotated
} else {
vec[grep(paste0("^", name, "\\["), names(vec))] <- c(rotated)
res <- vec
}
res
}
# Function to rotate (diagonal) matrices & return the diagonal
RotateSD <- function(name, vec, rotmat) {
mat <- diag(vec[grep(paste0("^", name, "\\["), names(vec))])
rotated <- mat%*%rotmat
vec[grep(paste0("^", name, "\\["), names(vec))] <- diag(rotated)
vec
}
# Extract SDs and latent variables
SDs <- ch[grep(SDby, names(ch))]
LVmat <- ChainToMatrix(ch, RotateBy)
if(scale) LVmat <- sweep(LVmat, 2, SDs, "*")
# Calculate a rotation matrix
if(rotation.fn!="svd"){
loadings.r <- do.call(rotation.fn, list(A=LVmat)) # rotate
}else{
rot <- svd(LVmat)$v
loadings.r <- list(Th=rot, loadings = LVmat%*%rot)
}
# Retrieve correct scale of LVs
if(scale){
#also need to put LVs into here
ch[grep(paste0("^", RotateBy, "\\["), names(ch))] <- scale(loadings.r$loadings, center = F)
ch[grep(paste0("^", SDby, "\\["), names(ch))] <- attr(scale(loadings.r$loadings, center = F),"scaled:scale")
} else{
# if LVs are not scaled by LV sds, we just add the new LV scale to LVs SDs.
ch[grep(paste0("^", RotateBy, "\\["), names(ch))] <- scale(loadings.r$loadings, center = F)
ch[grep(paste0("^", SDby, "\\["), names(ch))] <- attr(scale(loadings.r$loadings, center = F),"scaled:scale")*SDs
}
# Rotate rest of the variables
for(nm in unique(ToRotate)) {
ch <- RotateVar(vec=ch, name=nm, rotmat=loadings.r$Th, retmat = FALSE)
}
for(nm in unique(SDToRotate)) {
ch <- RotateSD(vec=ch, name=nm, rotmat=loadings.r$Th)
}
ch
}
RotateChains <- function(mcmc.lst, RotateBy="z", SDby = "sd.LV",
ToRotate=c("B", "epsilon", "gamma", "O", "varepsilon"),
SDToRotate=c("sd.Site", "sd.Species"), ...) {
if(SDby%in%SDToRotate)stop("'SDby' should not be present in 'SDToRotate'.")
if(RotateBy%in%ToRotate)stop("'RotateBy' should not be present in 'ToRotate'.")
rotated <- lapply(mcmc.lst, function(mcmc) {
rot <- apply(mcmc, 1, rotate2LV, RotateBy=RotateBy, SDby = SDby,
ToRotate=ToRotate, SDToRotate=SDToRotate, ...)
as.mcmc(t(rot))
})
as.mcmc.list(rotated)
}
RescaleVars <- function(vec, ScaleBy, SDTorescale,ToRescale) {
if(!any(c("z","gamma")%in%ScaleBy))stop("Not a valid choice.")
if("sd.LV"%in%SDTorescale)stop("Not a valid choice.")
vec2 <- vec
# get scale from z or gamma
ScBy <- ChainToMatrix(vec, ScaleBy)
SD <- apply(ScBy, 2, sd)
for(nm in unique(c(ScaleBy, ToRescale))) {
Sc.x <- ChainToMatrix(vec, nm)
Sc.x.sc <- sweep(Sc.x, 2, SD, "/")
vec2[grep(paste0("^", nm, "\\["), names(vec2))] <- c(Sc.x.sc)
}
# scale sd separately
for(nm in SDTorescale) {
SD.x <- vec[grep(paste0("^", nm, "\\["), names(vec))]
vec2[grep(paste0("^", nm, "\\["), names(vec2))] <- SD.x/SD
}
# scale into sd.LV
# sd.LV <- vec[grep("sd.LV", names(vec))]
# vec2[grep("sd.LV", names(vec2))] <- SD*sd.LV
vec2
}
RescaleChains <- function(mcmc.lst, ...) {
rescale <- lapply(mcmc.lst, function(mcmc) {
rot <- apply(mcmc, 1, RescaleVars, ...)
as.mcmc(t(rot))
})
as.mcmc.list(rescale)
}
We can run the fitting with the same code as before:
consts$nLVs <- nLVs <- 2
HO.2LV <- ParaNimble(4, fun = RunOneChain,
dat = dat,
code = HO.nim,
inits = inits, block = MakeBlockList(consts),
# Nburn=5e1, NIter=5.5e2, Nthin=1, # for a small run
Nburn = 1000, NIter = 4e3, Nthin = 1, # for a full run
consts = consts)
# Re-scale gamma
chains2LV.sc <- RescaleChains(HO.2LV, ScaleBy = "gamma",
SDTorescale = c("sd.Species"),
ToRescale = c("gamma", "O", "varepsilon"))
# Re-scale Z
chains2LV.sc2 <- RescaleChains(chains2LV.sc, ScaleBy = "z",
SDTorescale = c("sd.Site"),
ToRescale = c("z", "B", "epsilon"))
# Now rotate based on Z*Sigma and return its scale as LVsd.
chains2LV.r <- RotateChains(chains2LV.sc2, rotation.fn = "svd", scale = T)
## Loading required package: GPArotation
# post-process chains for sign-swapping
chains2LV <- postProcess(chains2LV.r, VarsToProcess = VarsToProcess, rule = 2, print=T)
## Swapping by z[25, 1], z[5, 2]
# Calculate summaries
summ.HO2LV <- summary(chains2LV)
rm(chains2LV.sc, chains2LV.sc2, chains2LV.r)
Environment first:
chainsPlot(chains2LV, var = c("B"), legend = F, traceplot=TRUE)
And the trait effects:
chainsPlot(chains2LV, var = c("O"), legend = F, traceplot=TRUE)
What we are really interested in, is looking at the environment-trait interactions. We can plot those with the following code.
# Function to get MCMC results for reduced-rank approximated interaction term
getMCMCInt <- function(mcmc.lst, dat,...) {
int.mcmc <- lapply(mcmc.lst, function(mcmc){
inCoefs <- as.mcmc(t(apply(mcmc, 1, function(ch){
B <- ChainToMatrix(ch,"B")
O <- ChainToMatrix(ch,"O")
SDs <- diag(ch[grep("sd.LV",gsub("\\s*\\[[^\\)]+\\]","",names(ch)))])
coefs <- B%*%SDs%*%t(O)
vec <- c(coefs)
names(vec) <- paste0(colnames(dat$X),":",rep(colnames(dat$TR),each=ncol(dat$X)))
vec
})))
}
)
as.mcmc.list(int.mcmc)
}
intChains <- getMCMCInt(chains2LV, dat = dat)
summ.EnvTraitInt <- summary(intChains)
coefs <- matrix(summ.EnvTraitInt$statistics[,1], ncol = ncol(dat$TR), nrow=ncol(dat$X), dimnames = list(colnames(dat$X),colnames(dat$TR)))
a <- 0.2
colort <- colorRampPalette(c("blue", "white", "red"))
lattice::levelplot(coefs, xlab = "Environmental Variables",
ylab = "Species traits", col.regions = colort(100), cex.lab = 1.3,
at = seq(-a, a, length = 100), scales = list(x = list(rot = 45)))
Now, we can make two-dimensional ordination plots of sites and species, with their predictor effects. Note that these have been rotated, so that most ofthe variance should be on the first axis.
AddArrows <- function(coords, marg= par("usr"), col=2) {
origin <- c(mean(marg[1:2]), mean(marg[3:4]))
Xlength <- sum(abs(marg[1:2]))/2
Ylength <- sum(abs(marg[3:4]))/2
ends <- coords / max(abs(coords)) * min(Xlength, Ylength) * .8
arrows(
x0 = origin[1],
y0 = origin[2],
x1 = ends[,
1] + origin[1],
y1 = ends[, 2] + origin[2],
col = col,
length = 0.1)
text(
x = origin[1] + ends[, 1] * 1.1,
y = origin[2] + ends[, 2] * 1.1,
labels = rownames(coords),
col = col)
}
ExtractMeans <- function(mcmc.lst, name=NULL) { # this needs defensive programming
if(is.null(name)) {
ind <- 1:nrow(mcmc.lst$statistics)
} else {
ind <- grep(name, rownames(mcmc.lst$statistics))
}
Means <- mcmc.lst$statistics[ind,"Mean"]
if(any(grep(",", names(Means)))) {
Col <- gsub("]", "", unique(gsub(".*, ", "", names(Means))))
res <- sapply(Col, function(cc, mn) {
mn[grep(paste0(cc,"]"), names(mn))]
}, Means)
} else {
res <- Means
}
res
}
GetMeans <- function(summ, name, d=1) {
if(is.null(d)) d <- 1
var <- summ$statistics[grep(paste0('^', name), rownames(summ$statistics)),"Mean"]
matrix(var,ncol=d)
}
# LVMeans <- ExtractMeans(LVs, name = "^LV\\[")
# gammaMeans <- ExtractMeans(gamma, name="^gamma")
z.m <- GetMeans(summ.HO2LV, name="^z", d=consts$nLVs)
gamma.m <- GetMeans(summ.HO2LV, name="^gamma", d=consts$nLVs)
vareps.m <- GetMeans(summ.HO2LV, name="^varepsilon", d=consts$nLVs)
eps.m <- GetMeans(summ.HO2LV, name="^epsilon", d=consts$nLVs)
B.m <- GetMeans(summ.HO2LV, name="B", d=consts$nLVs)
row.names(B.m) <- colnames(dat$X)
O.m <- GetMeans(summ.HO2LV, name="O", d=consts$nLVs)
row.names(O.m) <- colnames(dat$TR)
par(mfrow=c(2,1))
#LVs
plot(z.m,type="n", xlab="Latent variable 1", ylab="Latent variable 2", main = "Sites")
text(z.m,labels=1:consts$NSites)
AddArrows(marg = par("usr"), coords=B.m, col="red")
#gammas
plot(gamma.m,type="n", xlab="Latent variable 1", ylab="Latent variable 2", main = "Species")
text(gamma.m,labels=vegan::make.cepnames(colnames(Y)))
AddArrows(marg = par("usr"), coords=O.m, col="blue")
We can see that almost all of the environmental effects seem to act in the same way, mainly on LV1.
Ratio <- chains2LV$chain1[,"sd.LV[1]"]/chains2LV$chain1[,"sd.LV[2]"]
Ratio.u <- HO.2LV$chain1[,"sd.LV[1]"]/HO.2LV$chain1[,"sd.LV[2]"]
SDs <- summ.HO2LV$statistics[grep("sd", rownames(summ.HO2LV$statistic)),]
rownames(SDs) <- c(paste0("Latent Variable ", 1:2),
paste0("Site ", 1:2),
paste0("Species ", 1:2))
knitr::kable(SDs, digits=2, format = "html", table.attr = "style='width:70%;'")
| Mean | SD | Naive SE | Time-series SE | |
|---|---|---|---|---|
| Latent Variable 1 | 0.95 | 0.07 | 0 | 0.00 |
| Latent Variable 2 | 0.56 | 0.06 | 0 | 0.00 |
| Site 1 | 0.21 | 0.47 | 0 | 0.03 |
| Site 2 | -0.07 | 0.45 | 0 | 0.02 |
| Species 1 | 0.03 | 0.08 | 0 | 0.01 |
| Species 2 | -0.02 | 0.09 | 0 | 0.01 |
*INSERT VAREXPL INFO**
These tell us how well the predictors explain the ordination; the species- and site-specific scale parameters would be zero if the predictors fully explained the ordination. The scale parameters for the latent variables are similar to singular values (the square root of eigenvalues) in a classical ordination; they reflect a dimension its importance to the response.
The covariance of species is determined by traits, LV-specific variation and sites’ residual variation, and the covariance between sites is determined by the environment, LV-specific variation and species’ residual variation.
The residual covariance matrix of the model (where you calculate species associations from) is determined by three terms:
where \(\mathcal{K}_0\) denotes the zero order modified Bessel function of the second kind. The first term tells us that correlations between species are determined by the environment. The second term tells us that the correlation between sites is determined by species traits. The last term induces no correlation between sites and species for diagonal \(\boldsymbol{\Sigma}\), but serves to scale species associations. Consequently, the covariance for species \(j,j2\) and site \(i,i2\) is: \[\begin{multline} \Sigma_{i,i2,j,j2} = \text{cov}(\boldsymbol{x}_i^\top\boldsymbol{B}\boldsymbol{\Sigma}\boldsymbol{\varepsilon}_j,\boldsymbol{x}_{i2}^\top\boldsymbol{B}\boldsymbol{\Sigma}\boldsymbol{\varepsilon}_{j2}) + \text{cov}(\boldsymbol{x}_i^\top\boldsymbol{B}\boldsymbol{\Sigma}\boldsymbol{\varepsilon}_j,\boldsymbol{tr}_{j2}^\top\boldsymbol{\omega}\boldsymbol{\Sigma}\boldsymbol{\epsilon}_{i2}) + \text{cov}(\boldsymbol{x}_i^\top\boldsymbol{B}\boldsymbol{\Sigma}\boldsymbol{\varepsilon}_j,\boldsymbol{\epsilon}_{i2}^\top\boldsymbol{\Sigma}\boldsymbol{\varepsilon}_{j2}) + \text{cov}(\boldsymbol{tr}_j^\top\boldsymbol{\omega}\boldsymbol{\Sigma}\boldsymbol{\epsilon}_i,\boldsymbol{x}_{i2}^\top\boldsymbol{B}\boldsymbol{\Sigma}\boldsymbol{\varepsilon}_{j2}) + \text{cov}(\boldsymbol{tr}_{j}^\top\boldsymbol{\omega}\boldsymbol{\Sigma}\boldsymbol{\epsilon}_{i},\boldsymbol{tr}_{j2}^\top\boldsymbol{\omega}\boldsymbol{\Sigma}\boldsymbol{\epsilon}_{i2}) + \\ \text{cov}(\boldsymbol{tr}_j^\top\boldsymbol{\omega}\boldsymbol{\Sigma}\boldsymbol{\epsilon}_i,\boldsymbol{\epsilon}_{i2}^\top \boldsymbol{\Sigma}\boldsymbol{\varepsilon}_{j2}) + \text{cov}(\boldsymbol{\epsilon}_{i}^\top \boldsymbol{\Sigma}\boldsymbol{\varepsilon}_{j},\boldsymbol{x}_{i2}^\top\boldsymbol{B}\boldsymbol{\Sigma}\boldsymbol{\varepsilon}_{j2}) + \text{cov}(\boldsymbol{\epsilon}_{i}^\top \boldsymbol{\Sigma} \boldsymbol{\varepsilon}_{j},\boldsymbol{tr}_{j2}^\top\boldsymbol{\omega}\boldsymbol{\Sigma}\boldsymbol{\epsilon}_{i2})+ \text{cov}(\boldsymbol{\epsilon}_i^\top \boldsymbol{\Sigma}\boldsymbol{\varepsilon}_j,\boldsymbol{\epsilon}_{i2}^\top \boldsymbol{\Sigma}\boldsymbol{\varepsilon}_{j2}), \end{multline}\] third order terms are zero for central normal random variables, so this simplifies to: \[\begin{equation} \Sigma_{i,i2,j,j2} = \boldsymbol{x}_i^\top\boldsymbol{B}\boldsymbol{\Sigma}\text{cov}(\boldsymbol{\varepsilon}_j,\boldsymbol{\varepsilon}_{j2}) \boldsymbol{\Sigma}\boldsymbol{B}^\top\boldsymbol{x}_{i2}+ \boldsymbol{x}_i^\top\boldsymbol{B}\boldsymbol{\Sigma}\text{cov}(\boldsymbol{\varepsilon}_j,\boldsymbol{\epsilon}_{i2})\boldsymbol{\Sigma}\boldsymbol{\omega}^\top\boldsymbol{tr}_{j2} + \boldsymbol{tr}_j^\top\boldsymbol{\omega}\boldsymbol{\Sigma}\text{cov}(\boldsymbol{\epsilon}_i,\boldsymbol{\varepsilon}_{j2})\boldsymbol{\Sigma}\boldsymbol{B}^\top\boldsymbol{x}_{i2} + \boldsymbol{tr}_{j}^\top\boldsymbol{\omega}\boldsymbol{\Sigma}\text{cov}(\boldsymbol{\epsilon}_{i},\boldsymbol{\epsilon}_{i2})\boldsymbol{\Sigma}\boldsymbol{\omega}^\top\boldsymbol{tr}_{j2} + \text{cov}(\boldsymbol{\epsilon}_i^\top \boldsymbol{\Sigma}\boldsymbol{\varepsilon}_j,\boldsymbol{\epsilon}_{i2}^\top \boldsymbol{\Sigma} \boldsymbol{\varepsilon}_{j2}). \end{equation}\] Here, \(\text{cov}(\boldsymbol{\varepsilon}_j,\boldsymbol{\epsilon}_{i2}) = \text{cov}(\boldsymbol{\epsilon}_i,\boldsymbol{\varepsilon}_{j2}) = 0\), and for \(\text{cov}(\boldsymbol{\epsilon}_i^\top\boldsymbol{\Sigma}\boldsymbol{\varepsilon}_j,\boldsymbol{\epsilon}_{i2}^\top\boldsymbol{\Sigma}\boldsymbol{\varepsilon}_{j2}) = 2\text{tr}(\text{diag}(\boldsymbol{\delta}^2)\boldsymbol{\Sigma}\text{diag}(\boldsymbol{\sigma}^2)\boldsymbol{\Sigma})\). Further, \(\text{cov}(\boldsymbol{\varepsilon}_j,\boldsymbol{\varepsilon}_{j2})\) is zero for \(j \neq j2\) and \(\text{diag}(\boldsymbol{\delta}^2)\) otherwise, and similar for \(\text{cov}(\boldsymbol{\epsilon}_i,\boldsymbol{\epsilon}_{i2})\) .
Consequently, for a species association matrix where we consider the block of the covariance matrix where \(i = i2\), or for the site-to-site matrix where we consider the block \(j = j2\), we have:
\[\begin{split} \Sigma_{j,j2} &= \boldsymbol{x}_i^\top\boldsymbol{B}\boldsymbol{\Sigma}\text{cov}(\boldsymbol{\varepsilon}_j,\boldsymbol{\varepsilon}_{j2}) \boldsymbol{\Sigma}\boldsymbol{B}^\top\boldsymbol{x}_{i} + \text{cov}(\boldsymbol{\epsilon}_i^\top \boldsymbol{\Sigma}\boldsymbol{\varepsilon}_j,\boldsymbol{\epsilon}_{i}^\top \boldsymbol{\Sigma} \boldsymbol{\varepsilon}_{j2}) + \boldsymbol{tr}_{j}^\top\boldsymbol{\omega}\boldsymbol{\Sigma}\text{var}(\boldsymbol{\epsilon}_{i},\boldsymbol{\epsilon}_{i})\boldsymbol{\Sigma}\boldsymbol{\omega}^\top\boldsymbol{tr}_{j2} \\ &= \boldsymbol{x}_i^\top\boldsymbol{B}\boldsymbol{\Sigma}\text{cov}(\boldsymbol{\varepsilon}_j,\boldsymbol{\varepsilon}_{j2}) \boldsymbol{\Sigma}\boldsymbol{B}^\top\boldsymbol{x}_{i} + \boldsymbol{tr}_{j}^\top\boldsymbol{\omega}\boldsymbol{\Sigma}\text{diag}(\boldsymbol{\sigma}^2)\boldsymbol{\Sigma}\boldsymbol{\omega}^\top\boldsymbol{tr}_{j2}+ 2\text{tr}\{\text{diag}(\boldsymbol{\delta}^2)\boldsymbol{\Sigma}\text{diag}(\boldsymbol{\sigma}^2)\boldsymbol{\Sigma}\}, \end{split}\]and
\[\begin{equation} \Sigma_{i,i2} = \boldsymbol{tr}_{j}^\top\boldsymbol{\omega}\boldsymbol{\Sigma}\text{cov}(\boldsymbol{\epsilon}_{i},\boldsymbol{\epsilon}_{i2})\boldsymbol{\Sigma}\boldsymbol{\omega}^\top\boldsymbol{tr}_{j} + \boldsymbol{x}_i^\top\boldsymbol{B}\boldsymbol{\Sigma}\text{diag}(\boldsymbol{\delta}^2)\boldsymbol{\Sigma}\boldsymbol{B}^\top\boldsymbol{x}_{i2} + 2\text{tr}\{\text{diag}(\boldsymbol{\delta}^2)\boldsymbol{\Sigma}\text{diag}(\boldsymbol{\sigma}^2)\boldsymbol{\Sigma}\}. \end{equation}\]We can visualize those matrices here for (e.g.,) the first site and first species, respectively.
#species
spec.mat <- matrix(0,nrow=consts$NSpecies,ncol=consts$NSpecies)
Sigma <- diag(SDs[grep("Latent Variable", rownames(SDs)),1]^2)
Sigma.sp <- diag(SDs[grep("Species", rownames(SDs)),1]^2)
Sigma.si <- diag(SDs[grep("Site", rownames(SDs)),1]^2)
for(j in 1:consts$NSpecies){
for(j2 in 1:consts$NSpecies){
if(j==j2){
spec.mat[j,j2] = X[1,,drop=F]%*%B.m%*%Sigma%*%Sigma.sp%*%t(B.m)%*%t(X[1,,drop=F])
}
spec.mat[j,j2] = spec.mat[j,j2] + TR[j,,drop=F]%*%O.m%*%Sigma%*%Sigma.si%*%Sigma%*%t(O.m)%*%t(TR[j2,,drop=F]) + 2*sum(diag((Sigma.sp%*%Sigma%*%Sigma.si%*%Sigma)))
}
}
spec.cor.mat <- cov2cor(spec.mat)
colnames(spec.cor.mat) <- row.names(spec.cor.mat) <- colnames(Y)
corrplot::corrplot(spec.cor.mat,type = "lower",order = "AOE", main = "Species", mar = c(1,1,1,1),tl.srt=45,tl.cex = .5)
#sites
site.mat <- matrix(0,nrow=consts$NSites,ncol=consts$NSites)
for(i in 1:consts$NSites){
for(i2 in 1:consts$NSites){
if(i==i2){
site.mat[i,i2] = TR[1,,drop=F]%*%O.m%*%Sigma%*%Sigma.si%*%t(O.m)%*%t(TR[1,,drop=F])
}
site.mat[i,i2] = site.mat[i,i2] + X[i,,drop=F]%*%B.m%*%Sigma%*%Sigma.sp%*%Sigma%*%t(B.m)%*%t(X[i2,,drop=F]) + 2*sum(diag((Sigma.sp%*%Sigma%*%Sigma.si%*%Sigma)))
}
}
site.cor.mat <- cov2cor(site.mat)
colnames(site.cor.mat) <- row.names(site.cor.mat) <- paste("Site", 1:consts$NSites)
corrplot::corrplot(site.cor.mat,type = "lower",order = "AOE", main = "Sites", mar = c(3,3,3,3),tl.srt=45,tl.cex = .5)